A hybrid architecture for robust parsing of German
This paper provides an overview of current research on a hybrid and robust parsing architecture for the morphological, syntactic and semantic annotation of German text corpora. The novel contribution of this research lies not in the individual parsing modules, each of which relies on state-of-the-art algorithms and techniques. Rather, what is new about the present approach is the combination of these modules into a single architecture. This combination provides a means to significantly optimize the performance of each component, resulting in increased annotation accuracy.
Constructing a Valence Lexicon for a Treebank of German
"Treebanks allow for the creation of a valence lexicon per side effect. The TüBa-D/Z
valence lexicon has been created in lockstep with the development of the TüBa-
D/Z treebank as such. For each verb encountered in the treebank, the annotators
created a lexical entry that records the valence frames of the verbs contained in
the sentence, unless they are already contained in the valence lexicon as result of
previous annotation. The TüBa-D/Z valence lexicon currently contains a total of
8013 frames for 4896 distinct verb lemmas. Since treebank annotation is still ongoing,
the lexicon will continue to grow.
Such a lexicon has utility in its own right as a resource for lexicalized parsing
and a variety of NLP applications. At the same time, the lexicon can serve as a
source for aiding consistency of annotation and automatic detection of annotation
errors